Automated Test: db-cleanup-optimized #324

admin-coderabbit · 2026-02-04T18:02:52Z

This pull request was automatically created by @coderabbitai/e2e-reviewer.

Batch created pull request.

Summary by CodeRabbit

Bug Fixes
- Improved the annotation cleanup mechanism to handle large batches more efficiently and reliably during concurrent database operations.
Chores
- Increased the frequency of automated annotation cleanup from every 10 minutes to every 1 minute, enabling faster removal of outdated annotations.

… deadlocks on MySQL (#80329) * Split subquery when cleaning annotations * update comment * Raise batch size, now that we pay attention to it * Iterate in batches * Separate cancellable batch implementation to allow for multi-statement callbacks, add overload for single-statement use * Use split-out utility in outer batching loop so it respects context cancellation * guard against empty queries * Use SQL parameters * Use same approach for tags * drop unused function * Work around parameter limit on sqlite for large batches * Bulk insert test data in DB * Refactor test to customise test data creation * Add test for catching SQLITE_MAX_VARIABLE_NUMBER limit * Turn annotation cleanup test to integration tests * lint --------- Co-authored-by: Sofia Papagiannaki <1632407+papagian@users.noreply.github.com>

coderabbit-eval · 2026-02-04T18:04:44Z

📝 Walkthrough

Walkthrough

Refactors annotation cleanup logic to process deletions in batches with context cancellation support, avoiding deadlocks under concurrent load. Updates cleanup trigger frequency from 10 to 1 minute and introduces integration tests with database state cleanup and batched insertion patterns.

Changes

Cohort / File(s)	Summary
Batched Cleanup Refactoring `pkg/services/annotations/annotationsimpl/xorm_store.go`	Replaces single-statement DELETE operations with looped batched deletion using `fetchIDs` and `deleteByIDs` helpers. Adds context cancellation awareness and SQLite parameter limit handling (inlines IDs when batch exceeds 999 for SQLite). Applies same pattern to time-based, count-based, and orphaned tag cleanup.
Cleanup Frequency Adjustment `pkg/services/cleanup/cleanup.go`	Changes cleanup ticker interval from 10 minutes to 1 minute, increasing cleanup frequency.
Integration Test Refactoring `pkg/services/annotations/annotationsimpl/cleanup_test.go`	Converts cleanup test to integration variant with test table-driven configuration (batch sizes, annotation counts). Introduces batched insertion using `InsertMulti` for test data. Adds context-aware database cleanup with error aggregation. Includes scenario testing for batches exceeding SQLite parameter limits.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 In batches small we hop and clean,
No deadlocks now in between,
One minute ticks, the schedule flies,
SQLite limits caught our eyes,
Annotations tidied, tests aligned! 🧹✨

🚥 Pre-merge checks | ✅ 1 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name	Status	Explanation	Resolution
Title check	❓ Inconclusive	The title 'Automated Test: db-cleanup-optimized' is vague and generic, using non-descriptive phrasing that doesn't clearly convey the main changes in the changeset.	Use a more specific title that describes the primary change, such as 'Refactor annotation cleanup with batched processing and integration tests' or 'Optimize database cleanup with batch operations and improved test coverage'.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch db-cleanup-optimized

_{Comment @coderabbit eval help to get the list of available commands and usage tips.}

coderabbit-eval

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@pkg/services/annotations/annotationsimpl/xorm_store.go`:
- Around line 534-538: The current routine cleanup logging uses r.log.Error for
normal batch progress (calls like r.log.Error("Annotations to clean by time",
... ) and r.log.Error("cleaned annotations by time", ... )) which will flood
error alerts; change these to r.log.Info or r.log.Debug as appropriate, stop
logging the ids slice (log only its length or sample) and remove the redundant
"err" field where err is known to be nil before the log; apply the same change
pattern to the other occurrences around the deleteByIDs calls (references:
r.log.Error, deleteByIDs, the "Annotations to clean by time" and "cleaned
annotations by time" messages) so only real errors use r.log.Error and routine
progress logs use Info/Debug with compact payloads.

In `@pkg/services/cleanup/cleanup.go`:
- Line 77: The ticker interval in Run (ticker := time.NewTicker(...)) is 1
minute while clean() uses a 9-minute context timeout, causing
overlapping/back-to-back runs; update them to be consistent by either restoring
the ticker to the previous longer interval (e.g., set ticker :=
time.NewTicker(time.Minute * 10)) or reducing the context timeout inside clean()
to something under 1 minute (e.g., <1m) so the context timeout is always shorter
than the ticker interval; ensure you update both the ticker declaration and the
corresponding timeout logic used by clean() (and similar logic referenced in
lines 88-94) so they remain aligned.

🧹 Nitpick comments (3)

pkg/services/annotations/annotationsimpl/cleanup_test.go (1)

237-271: Manual ID assignment couples test setup to DB internals.

Line 241 sets ID: int64(i + 1), which relies on the DB accepting explicitly provided IDs and not conflicting with auto-increment sequences. This works for SQLite but may break if the test DB engine changes or if rows from a previous subtest leave the auto-increment counter advanced (though DELETE doesn't reset it). Since the IDs are also baked into newAnnotationTags at line 270, any mismatch would silently produce orphaned tags.

Consider either removing explicit ID assignment and querying back for inserted IDs, or documenting this assumption.
pkg/services/annotations/annotationsimpl/xorm_store.go (2)
536-538: Variable names x, y are cryptic for returned values.

x, y don't convey meaning. Use affected, err (or batchAffected, batchErr to avoid shadowing) for readability.
Example fix for one occurrence
-			x, y := r.deleteByIDs(ctx, "annotation", ids)
-			r.log.Error("cleaned annotations by time", "count", len(ids), "affected", x, "err", y)
-			return x, y
+			batchAffected, batchErr := r.deleteByIDs(ctx, "annotation", ids)
+			r.log.Debug("cleaned annotations by time", "count", len(ids), "affected", batchAffected, "err", batchErr)
+			return batchAffected, batchErr
Also applies to: 556-558, 578-580

605-618: SQLite ID-inlining branch uses O(n²) string concatenation.

Lines 609–612 build the ID list via repeated fmt.Sprintf("%s, %d", ...), which creates a new string allocation on every iteration. With the test batch size of 32767, this becomes expensive. Use strings.Builder or strings.Join instead.

Also, the condition at line 608 checks r.cfg.AnnotationCleanupJobBatchSize > sqliteParameterLimit (the configured maximum), not len(ids) (the actual count). This means even a batch of 5 IDs would take the inlining path if the configured batch size exceeds 999.
Proposed fix using strings.Builder
-	if r.db.GetDBType() == migrator.SQLite && r.cfg.AnnotationCleanupJobBatchSize > sqliteParameterLimit {
-		values := fmt.Sprint(ids[0])
-		for _, v := range ids[1:] {
-			values = fmt.Sprintf("%s, %d", values, v)
-		}
-		sql = fmt.Sprintf(`DELETE FROM %s WHERE id IN (%s)`, table, values)
+	if r.db.GetDBType() == migrator.SQLite && len(ids) > sqliteParameterLimit {
+		var b strings.Builder
+		b.WriteString(strconv.FormatInt(ids[0], 10))
+		for _, v := range ids[1:] {
+			b.WriteByte(',')
+			b.WriteString(strconv.FormatInt(v, 10))
+		}
+		sql = fmt.Sprintf(`DELETE FROM %s WHERE id IN (%s)`, table, b.String())

coderabbit-eval · 2026-02-04T18:09:59Z

pkg/services/annotations/annotationsimpl/xorm_store.go

+			r.log.Error("Annotations to clean by time", "count", len(ids), "ids", ids, "cond", cond, "err", err)

-		affected, err := r.executeUntilDoneOrCancelled(ctx, sql)
+			x, y := r.deleteByIDs(ctx, "annotation", ids)
+			r.log.Error("cleaned annotations by time", "count", len(ids), "affected", x, "err", y)
+			return x, y


⚠️ Potential issue | 🟠 Major

r.log.Error used for routine operational logging — will pollute error logs.

These log lines (lines 534, 537, 554, 557, 576, 579) all use Error level for what is normal batch-progress logging. In production, this will flood error dashboards/alerts every cleanup cycle. Use Debug or Info level instead.

Proposed fix (showing pattern for all occurrences)

- r.log.Error("Annotations to clean by time", "count", len(ids), "ids", ids, "cond", cond, "err", err) + r.log.Debug("Annotations to clean by time", "count", len(ids), "cond", cond) - r.log.Error("cleaned annotations by time", "count", len(ids), "affected", x, "err", y) + r.log.Debug("cleaned annotations by time", "count", len(ids), "affected", x, "err", y) - r.log.Error("Annotations to clean by count", "count", len(ids), "ids", ids, "cond", cond, "err", err) + r.log.Debug("Annotations to clean by count", "count", len(ids), "cond", cond) - r.log.Error("cleaned annotations by count", "count", len(ids), "affected", x, "err", y) + r.log.Debug("cleaned annotations by count", "count", len(ids), "affected", x, "err", y) - r.log.Error("Tags to clean", "count", len(ids), "ids", ids, "cond", cond, "err", err) + r.log.Debug("Tags to clean", "count", len(ids), "cond", cond) - r.log.Error("cleaned tags", "count", len(ids), "affected", x, "err", y) + r.log.Debug("cleaned tags", "count", len(ids), "affected", x, "err", y)

Additionally, note that lines 534 and 554 log "err", err where err is always nil at that point (it was already checked on line 531/551 and would have returned). The ids slice is also logged, which can be enormous for large batches — consider omitting it or logging only the count.

Also applies to: 554-558, 576-579

🤖 Prompt for AI Agents

In `@pkg/services/annotations/annotationsimpl/xorm_store.go` around lines 534 - 538, The current routine cleanup logging uses r.log.Error for normal batch progress (calls like r.log.Error("Annotations to clean by time", ... ) and r.log.Error("cleaned annotations by time", ... )) which will flood error alerts; change these to r.log.Info or r.log.Debug as appropriate, stop logging the ids slice (log only its length or sample) and remove the redundant "err" field where err is known to be nil before the log; apply the same change pattern to the other occurrences around the deleteByIDs calls (references: r.log.Error, deleteByIDs, the "Annotations to clean by time" and "cleaned annotations by time" messages) so only real errors use r.log.Error and routine progress logs use Info/Debug with compact payloads.

coderabbit-eval · 2026-02-04T18:09:59Z

pkg/services/cleanup/cleanup.go

@@ -74,7 +74,7 @@ func (j cleanUpJob) String() string {
 func (srv *CleanUpService) Run(ctx context.Context) error {
 	srv.cleanUpTmpFiles(ctx)

-	ticker := time.NewTicker(time.Minute * 10)
+	ticker := time.NewTicker(time.Minute * 1)


⚠️ Potential issue | 🟡 Minor

Ticker interval (1 min) vs. cleanup timeout (9 min) mismatch.

The ticker now fires every 1 minute, but clean() still uses a 9-minute context timeout. Since Run is single-threaded, a long-running cleanup will cause the next tick to fire immediately upon completion, leading to back-to-back cleanup cycles. The previous 10-minute interval matched the 9-minute timeout, leaving a 1-minute buffer. Consider reducing the timeout to something below 1 minute, or keeping the ticker interval above the timeout.

Also applies to: 88-94

🤖 Prompt for AI Agents

In `@pkg/services/cleanup/cleanup.go` at line 77, The ticker interval in Run (ticker := time.NewTicker(...)) is 1 minute while clean() uses a 9-minute context timeout, causing overlapping/back-to-back runs; update them to be consistent by either restoring the ticker to the previous longer interval (e.g., set ticker := time.NewTicker(time.Minute * 10)) or reducing the context timeout inside clean() to something under 1 minute (e.g., <1m) so the context timeout is always shorter than the ticker interval; ensure you update both the ticker declaration and the corresponding timeout logic used by clean() (and similar logic referenced in lines 88-94) so they remain aligned.

coderabbit-eval bot closed this Feb 4, 2026

coderabbit-eval bot reviewed Feb 4, 2026

View reviewed changes

Automated Test: db-cleanup-optimized #324

Automated Test: db-cleanup-optimized #324

admin-coderabbit commented Feb 4, 2026 •

edited by coderabbit-eval bot

Loading

coderabbit-eval bot commented Feb 4, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

coderabbit-eval bot left a comment

coderabbit-eval bot Feb 4, 2026

coderabbit-eval bot Feb 4, 2026

Automated Test: db-cleanup-optimized #324

Automated Test: db-cleanup-optimized #324

Conversation

admin-coderabbit commented Feb 4, 2026 • edited by coderabbit-eval bot Loading

Summary by CodeRabbit

coderabbit-eval bot commented Feb 4, 2026 • edited Loading

Walkthrough

Changes

Estimated code review effort

Poem

coderabbit-eval bot left a comment

Choose a reason for hiding this comment

coderabbit-eval bot Feb 4, 2026

Choose a reason for hiding this comment

coderabbit-eval bot Feb 4, 2026

Choose a reason for hiding this comment

admin-coderabbit commented Feb 4, 2026 •

edited by coderabbit-eval bot

Loading

coderabbit-eval bot commented Feb 4, 2026 •

edited

Loading